---
sidebar_position: 1
---

# Streaming

Some LLMs can stream their response. Instead of waiting for the complete response to be returned, you can start processing it as soon as the first chunks arrive. This is useful if you want to display partial output to the user, or to process the response while it is still being generated.

## Using `.stream()`

import CodeBlock from "@theme/CodeBlock";

The easiest way to stream is to use the `.stream()` method. This returns a readable stream that you can also iterate over:

import StreamMethodExample from "@examples/models/llm/llm_streaming_stream_method.ts";

import IntegrationInstallTooltip from "@mdx_components/integration_install_tooltip.mdx";

<IntegrationInstallTooltip></IntegrationInstallTooltip>

```bash npm2yarn
npm install @langchain/openai
```

<CodeBlock language="typescript">{StreamMethodExample}</CodeBlock>

For models that do not support streaming, the entire response will be returned as a single chunk.
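Conceptually, `.stream()` returns an async iterable of chunks that you consume with `for await...of`. The sketch below illustrates that consumption pattern in isolation; `fakeStream` is a stand-in for the stream a real model call such as `model.stream("...")` would return, so you can see the shape of the loop without an API key:

```typescript
// Stand-in for the async-iterable stream a model's .stream() call returns.
async function* fakeStream(): AsyncGenerator<string> {
  for (const token of ["Hello", ", ", "world", "!"]) {
    yield token;
  }
}

// Consume the stream chunk by chunk, exactly as you would with .stream().
async function collect(stream: AsyncIterable<string>): Promise<string> {
  const chunks: string[] = [];
  for await (const chunk of stream) {
    chunks.push(chunk); // process each chunk as soon as it arrives
  }
  return chunks.join("");
}

collect(fakeStream()).then((text) => console.log(text)); // "Hello, world!"
```

With a real model, each chunk arrives as the model generates it, so you can render partial output immediately instead of buffering the whole response.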

## Using a callback handler

You can also use a [`CallbackHandler`](https://github.com/langchain-ai/langchainjs/blob/main/langchain/src/callbacks/base.ts) like so:

import StreamingExample from "@examples/models/llm/llm_streaming.ts";

<CodeBlock language="typescript">{StreamingExample}</CodeBlock>

We still have access to the final `LLMResult` when using `generate`. However, `tokenUsage` is not currently reported when streaming.
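The flow here is that the handler's `handleLLMNewToken` method fires once per token while the complete result is still assembled and returned to the caller at the end. The self-contained sketch below illustrates that control flow; `fakeModelCall` is a stand-in for a streaming model invocation (with a real model you would instead pass the handler via the `callbacks` option to `generate`):

```typescript
// Shape of a streaming callback handler: one method invoked per token.
interface StreamingHandler {
  handleLLMNewToken(token: string): void;
}

// Stand-in for a streaming LLM call. It fires the handler for each token
// as it "arrives", yet still returns the complete text at the end,
// analogous to the final LLMResult from generate().
async function fakeModelCall(
  prompt: string,
  handler: StreamingHandler
): Promise<string> {
  const tokens = ["Why ", "did ", "the ", "chicken..."];
  let full = "";
  for (const token of tokens) {
    handler.handleLLMNewToken(token); // called as each token streams in
    full += token;
  }
  return full;
}

const seen: string[] = [];
fakeModelCall("Tell me a joke.", {
  handleLLMNewToken(token) {
    seen.push(token); // e.g. append to the UI as tokens arrive
  },
}).then((full) => console.log(full)); // "Why did the chicken..."
```

This is why both styles compose: the callback gives you incremental tokens for display, while the returned result still gives you the full generation for downstream processing.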
